Speeding Up Materialized View Selection in Data Warehouses Using a Randomized Algorithm
Authors
Abstract
A data warehouse stores information that is collected from multiple, heterogeneous information sources for the purpose of complex querying and analysis. Information in the warehouse is typically stored in the form of materialized views, which represent pre-computed portions of frequently asked queries. One of the most important tasks when designing a warehouse is the selection of materialized views to be maintained in the warehouse. The goal is to select a set of views in such a way as to minimize the total query response time over all queries, given a limited amount of time for maintaining the views (maintenance-cost view selection problem). In this paper, we propose an efficient solution to the maintenance-cost view selection problem using a genetic algorithm for computing a near-optimal set of views. Specifically, we explore the maintenance-cost view selection problem in the context of OR view graphs. We show that our approach represents a dramatic improvement in time complexity over existing search-based approaches using heuristics. Our analysis shows that the algorithm consistently yields a solution that lies within 10% of the optimal query benefit while at the same time exhibiting only a linear increase in execution time. We have implemented a prototype version of our algorithm which is used to simulate the measurements used in the analysis of our approach.
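The approach described in the abstract can be illustrated with a minimal sketch of a genetic algorithm for maintenance-cost view selection. This is not the authors' implementation: the toy instance (`VIEWS`, `QUERY_FREQ`, `BASE_COST`, `BUDGET`), the penalty-based fitness function, and the operator choices (tournament selection, one-point crossover, bit-flip mutation, elitism) are all assumptions made for illustration. The OR view graph is reduced here to its simplest form, where each query is answered from the single cheapest materialized view that can answer it.

```python
import random

# Hypothetical toy instance (not from the paper).
BASE_COST = 100              # cost of answering any query from base tables
VIEWS = [
    # (read_cost, maintenance_cost, queries this view can answer)
    (50, 30, {0, 1, 2, 3}),
    (20, 15, {0, 1}),
    (25, 20, {2, 3}),
    (5, 10, {0}),
    (8, 12, {3}),
]
QUERY_FREQ = [10, 4, 6, 8]   # how often each query is asked
BUDGET = 45                  # limit on total maintenance cost


def maintenance(bits):
    """Total maintenance cost of the selected views."""
    return sum(v[1] for v, b in zip(VIEWS, bits) if b)


def benefit(bits):
    """Reduction in total query cost versus materializing nothing."""
    total = 0
    for q, freq in enumerate(QUERY_FREQ):
        # OR semantics: a query uses the cheapest single view that answers it.
        costs = [v[0] for v, b in zip(VIEWS, bits) if b and q in v[2]]
        total += freq * (BASE_COST - min(costs, default=BASE_COST))
    return total


def fitness(bits):
    """Benefit if within budget, otherwise a negative penalty (assumed scheme)."""
    excess = maintenance(bits) - BUDGET
    return benefit(bits) if excess <= 0 else -excess


def select_views(pop_size=30, generations=60, p_mut=0.1, seed=0):
    rng = random.Random(seed)
    n = len(VIEWS)
    # Seed the population with the empty selection, which is always feasible.
    pop = [[0] * n] + [[rng.randint(0, 1) for _ in range(n)]
                       for _ in range(pop_size - 1)]
    for _ in range(generations):
        nxt = [max(pop, key=fitness)[:]]          # elitism: keep the best
        while len(nxt) < pop_size:
            a = max(rng.sample(pop, 2), key=fitness)   # tournament of size 2
            b = max(rng.sample(pop, 2), key=fitness)
            cut = rng.randint(1, n - 1)                # one-point crossover
            child = a[:cut] + b[cut:]
            child = [bit ^ 1 if rng.random() < p_mut else bit for bit in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)


if __name__ == "__main__":
    best = select_views()
    print("selected:", best, "benefit:", benefit(best),
          "maintenance:", maintenance(best))
```

Because infeasible selections always score below zero and the empty selection scores exactly zero, elitism guarantees that the returned selection respects the maintenance budget. The work per generation is fixed by `pop_size`, so the running time of this sketch grows with the number of generations and the cost of one fitness evaluation rather than with the number of possible view subsets.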
Related resources
Speeding up Warehouse Physical Design Using a Randomized Algorithm
A data warehouse stores information that is collected from multiple, heterogeneous information sources for the purpose of complex querying and analysis. Information in the warehouse is typically stored in the form of materialized views. One of the most important tasks when designing a warehouse is the selection of materialized views to be maintained in the warehouse. The goal is to select a set...
Using Relational Database Constraints to Design Materialized Views in Data Warehouses
Queries to data warehouses often involve hundreds of complex aggregations over large volumes of data, and so it is infeasible to compute these queries by scanning the data sources each time. Data warehouses therefore build a large number of materialized views to increase system performance. However, materialized views need to be immediately updated when their sources are changed, leading to a pos...
TSGV: a table-like structure-based greedy method for materialized view selection in data warehouses
Since a data warehouse deals with huge amounts of data and complex analytical queries, processing and answering users’ queries online can be a serious challenge. In online analytical processing, materialized views are used to speed up query processing instead of accessing the database directly. Since the large number and high volume of views prevents all of the views from b...
Rewriting OLAP Queries Using Materialized Views and Dimension Hierarchies in Data Warehouses
OLAP queries involve a lot of aggregations on a large amount of data in data warehouses. To process expensive OLAP queries efficiently, we propose a new method for rewriting a given OLAP query using various kinds of materialized aggregate views which already exist in data warehouses. We first define the normal forms of OLAP queries and materialized views based on the lattice of dimension hierar...
A Solution to View Management to Build a Data Warehouse
Several techniques exist to select and materialize a proper set of data in a suitable structure that manages the queries submitted to the online analytical processing systems. These techniques are called view management techniques, which consist of three research areas: 1) view selection to materialize, 2) query processing and rewriting using the materialized views, and 3) maintaining materializ...
Journal title:
- Int. J. Cooperative Inf. Syst.
Volume 10, Issue
Pages -
Publication date 2001